OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Interaction for Style-constrained OCR

Internal identifier: 000F34 (Main/Exploration); previous: 000F33; next: 000F35

Authors: Sriharsha Veermachaneni [Italy]; George Nagy (informaticien) [United States]

Source:

RBID : Pascal:08-0459036

French descriptors

English descriptors

Abstract

The error rate can be considerably reduced on a style-consistent document if its style is identified and the right style-specific classifier is used. Since in some applications both machines and humans have difficulty in identifying the style, we propose a strategy to improve the accuracy of style-constrained classification by enlisting the human operator to identify the labels of some characters selected by the machine. We present an algorithm to select the set of characters that is likely to reduce the error rate on unlabeled characters by utilizing the labels to reclassify the remaining characters. We demonstrate the efficacy of our algorithm on simulated data.


Affiliations:


Links toward previous steps (curation, corpus...)


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Interaction for Style-constrained OCR</title>
<author>
<name sortKey="Veermachaneni, Sriharsha" sort="Veermachaneni, Sriharsha" uniqKey="Veermachaneni S" first="Sriharsha" last="Veermachaneni">Sriharsha Veermachaneni</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>SRA division, ITC-IRST</s1>
<s2>Trento</s2>
<s3>ITA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Italie</country>
<wicri:noRegion>Trento</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
<affiliation wicri:level="2">
<inist:fA14 i1="02">
<s1>RPI ECSE DocLab</s1>
<s2>Troy, NY</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
<placeName>
<settlement type="city">Troy (New York)</settlement>
<region type="state">État de New York</region>
</placeName>
<orgName type="lab" n="5">Institut polytechnique Rensselaer</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">08-0459036</idno>
<date when="2007">2007</date>
<idno type="stanalyst">PASCAL 08-0459036 INIST</idno>
<idno type="RBID">Pascal:08-0459036</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000258</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000526</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000276</idno>
<idno type="wicri:Area/Main/Merge">000F47</idno>
<idno type="wicri:Area/Main/Curation">000F34</idno>
<idno type="wicri:Area/Main/Exploration">000F34</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Interaction for Style-constrained OCR</title>
<author>
<name sortKey="Veermachaneni, Sriharsha" sort="Veermachaneni, Sriharsha" uniqKey="Veermachaneni S" first="Sriharsha" last="Veermachaneni">Sriharsha Veermachaneni</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>SRA division, ITC-IRST</s1>
<s2>Trento</s2>
<s3>ITA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Italie</country>
<wicri:noRegion>Trento</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
<affiliation wicri:level="2">
<inist:fA14 i1="02">
<s1>RPI ECSE DocLab</s1>
<s2>Troy, NY</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
<placeName>
<settlement type="city">Troy (New York)</settlement>
<region type="state">État de New York</region>
</placeName>
<orgName type="lab" n="5">Institut polytechnique Rensselaer</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Proceedings of Electronic Imaging Science and Technology</title>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Proceedings of Electronic Imaging Science and Technology</title>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Accuracy</term>
<term>Algorithms</term>
<term>Automatic classification</term>
<term>Error rate</term>
<term>Human operator</term>
<term>Man</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Signal classification</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Algorithme</term>
<term>Homme</term>
<term>Reconnaissance optique caractère</term>
<term>Taux erreur</term>
<term>Classification automatique</term>
<term>Précision</term>
<term>Opérateur humain</term>
<term>Reconnaissance forme</term>
<term>Classification signal</term>
<term>0130C</term>
<term>4230S</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Homme</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">The error rate can be considerably reduced on a style-consistent document if its style is identified and the right style-specific classifier is used. Since in some applications both machines and humans have difficulty in identifying the style, we propose a strategy to improve the accuracy of style-constrained classification by enlisting the human operator to identify the labels of some characters selected by the machine. We present an algorithm to select the set of characters that is likely to reduce the error rate on unlabeled characters by utilizing the labels to reclassify the remaining characters. We demonstrate the efficacy of our algorithm on simulated data.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Italie</li>
<li>États-Unis</li>
</country>
<region>
<li>État de New York</li>
</region>
<settlement>
<li>Troy (New York)</li>
</settlement>
<orgName>
<li>Institut polytechnique Rensselaer</li>
</orgName>
</list>
<tree>
<country name="Italie">
<noRegion>
<name sortKey="Veermachaneni, Sriharsha" sort="Veermachaneni, Sriharsha" uniqKey="Veermachaneni S" first="Sriharsha" last="Veermachaneni">Sriharsha Veermachaneni</name>
</noRegion>
</country>
<country name="États-Unis">
<region name="État de New York">
<name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
</region>
</country>
</tree>
</affiliations>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F34 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000F34 | SxmlIndent | more
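
Outside Dilib, the record can also be read with a general-purpose XML parser. The sketch below is only an illustration and is not part of the Dilib toolchain. It uses Python's standard xml.etree.ElementTree on a shortened copy of the record. The record uses the wicri: and inist: prefixes without declaring them, so a strict parser rejects it as written. The sketch works around this by declaring both prefixes on the root element with placeholder namespace URIs. Those URIs, and the helper names parse_record and summarize, are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record above; only the elements used below are kept.
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Interaction for Style-constrained OCR</title>
<author>
<name sortKey="Veermachaneni, Sriharsha">Sriharsha Veermachaneni</name>
<affiliation wicri:level="1">
<country>Italie</country>
</affiliation>
</author>
<author>
<name sortKey="Nagy, George">George Nagy (informaticien)</name>
<affiliation wicri:level="2">
<country>États-Unis</country>
</affiliation>
</author>
</titleStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Accuracy</term>
<term>Optical character recognition</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""

# Placeholder namespace URIs (assumption): the record uses the wicri: and
# inist: prefixes without declaring them, which a strict parser rejects.
PLACEHOLDER_NS = {"wicri": "urn:example:wicri", "inist": "urn:example:inist"}


def parse_record(text):
    """Parse a record after declaring the missing prefixes on its root."""
    decls = " ".join(f'xmlns:{p}="{u}"' for p, u in PLACEHOLDER_NS.items())
    return ET.fromstring(text.replace("<record>", f"<record {decls}>", 1))


def summarize(root):
    """Return the title, (author, country) pairs, and English keywords."""
    stmt = root.find(".//titleStmt")
    title = stmt.findtext("title")
    authors = [
        (a.findtext("name"), a.findtext("affiliation/country"))
        for a in stmt.findall("author")
    ]
    keywords = [t.text for t in root.findall(".//keywords[@scheme='KwdEn']/term")]
    return title, authors, keywords


if __name__ == "__main__":
    title, authors, keywords = summarize(parse_record(RECORD))
    print(title)
    for name, country in authors:
        print(f"  {name} [{country}]")
    print("Keywords:", "; ".join(keywords))
```

To run it against the full record, the output of the HfdSelect command above could be saved to a file and its contents passed to parse_record in place of RECORD.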

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:08-0459036
   |texte=   Interaction for Style-constrained OCR
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024